D-Confidence: An Active Learning Strategy which Efficiently Identifies Small Classes

Authors

  • Nuno Escudeiro
  • Alipio Jorge
Abstract

In some classification tasks, such as those related to the automatic building and maintenance of text corpora, it is expensive to obtain labeled examples to train a classifier. In such circumstances it is common to have massive corpora in which only a few examples (typically a minority) are labeled. Semi-supervised learning techniques try to leverage the intrinsic information in unlabeled examples to improve classification models. However, these techniques assume that the labeled examples cover all the classes to be learned, which might not hold. When the class distribution is imbalanced, obtaining labeled examples from minority classes can be very costly if queries are selected at random. Active learning allows asking an oracle to label new examples that are carefully selected, and it does not assume prior knowledge of all classes. D-Confidence is an active learning approach that is effective in the presence of imbalanced training sets. In this paper we discuss the performance of d-Confidence over text corpora. We show empirically that, compared to confidence, a common active learning criterion, d-Confidence reduces the number of queries required to identify examples from all the classes to be learned.
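For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of a generic confidence-based (least-confidence) query selection step. It is not the authors' d-Confidence criterion, which the abstract does not define. The function name `least_confident_query` and the posterior values in `pool` are hypothetical, chosen only for illustration.

```python
def least_confident_query(posteriors):
    """Return the index of the unlabeled example the classifier is least sure about.

    `posteriors` is a list with one entry per unlabeled example; each entry is
    a list of class posterior probabilities produced by the current classifier.
    The confidence of an example is taken to be its highest class posterior.
    """
    if not posteriors:
        raise ValueError("no unlabeled examples to query")
    confidences = [max(p) for p in posteriors]
    # Ties are broken in favor of the first example in the pool.
    return min(range(len(confidences)), key=confidences.__getitem__)


# Hypothetical posteriors for three unlabeled examples over three known classes.
pool = [
    [0.90, 0.05, 0.05],  # classifier is confident
    [0.40, 0.35, 0.25],  # classifier is unsure
    [0.60, 0.30, 0.10],
]
query_index = least_confident_query(pool)  # example to send to the oracle
```

A criterion like this scores examples only against classes the classifier already knows, which is the limitation the abstract raises for imbalanced corpora where some classes have no labeled examples yet.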

Similar resources

Team-Based Learning A New Strategy in Integrated Medical Curriculum: The experience of School of Medicine, Tehran University of Medical Sciences

Introduction: Revising medical curricula & establishing reformed programs result in refined educational methods which are accompanied by development of active & student-based teaching methods such as Team Based Learning (TBL). This paper describes an experience of implementation and evaluation findings of TBL according to medical students’ viewpoints. Methods: This action research was planned ...

Active Learning by Sparse Instance Tracking and Classifier Confidence in Acoustic Emotion Recognition

Data scarcity is an ever crucial problem in the field of acoustic emotion recognition. How to get the most informative data from a huge amount of data by least human work and at the same time to obtain the highest performance is quite important. In this paper, we propose and investigate two active learning strategies in acoustic emotion recognition: Based on sparse instances or based on classif...

Book Review: "Learning Strategy Instruction in the Language Classroom: Issues and Implementation"

Language learning strategies, “the techniques or devices which a learner may use to acquire knowledge” (Rubin, 1975, p. 43) or more pertinently “complex, dynamic thoughts and actions, selected and used by learners with some degree of consciousness in specific contexts” (Oxford, 2017, p. 48), have been widely researched and discussed for more than forty years since the mid-1970s. Shifting the fo...

Learning Document Image Features With SqueezeNet Convolutional Neural Network

The classification of various document images is considered an important step towards building a modern digital library or office automation system. Convolutional Neural Network (CNN) classifiers trained with backpropagation are considered to be the current state of the art model for this task. However, there are two major drawbacks for these classifiers: the huge computational power demand for...

Language Learning Strategy Use and Instruction for the Iranian Junior High School EFL Learners: A Mixed Methods Approach

In order to confirm the effectiveness of language learning strategies in the Iranian context in junior high schools, this study was designed to examine the patterns of strategy use, the effects of strategy instruction on the students’ strategy use, and the relationship between the participants’ strategy use and their English achievement. To achieve this objective, 57 junior high school participants...

Publication date: 2010